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Abstract 

Nonlinear Kalman Filters are powerful and widely-used techniques when trying to estimate the 
hidden state of a stochastic nonlinear dynamic system. In this paper, we extend the Smart Sampling 
Kalman Filter (S^KF) with a new point symmetric Gaussian sampling scheme. This not only improves 
the S^KF’s estimation quality, but also reduces the time needed to compute the required optimal 
Gaussian samples drastically. Moreover, we improve the numerical stability of the sample computation, 
which allows us to accurately approximate a thousand-dimensional Gaussian distribution using tens 
of thousands of optimally placed samples. We evaluate the new symmetric S^KF by computing 
higher-order moments of standard normal distributions and investigate the estimation quality of the 
S^KF when dealing with symmetric measurement equations. Finally, extended object tracking based 
on many measurements per time step is considered. This high-dimensional estimation problem shows 
the advantage of the S^KF being able to use an arbitrary number of samples independent of the state 
dimension, in contrast to other state-of-the-art sample-based Kalman Filters. 


1. Introduction 

Estimating the hidden state of a stochastic dynamic system based on noisy measurements is 
crucial for many applications in control, object tracking, or robotics. When considering linear systems 
corrupted by additive Gaussian noise, the Kalman Filter (KF) is the optimal estimator with respect to 
the mean square error [T]. Unfortunately, most practical problems are nonlinear, making closed-form 
solutions intractable. Consequently, approximative approaches have to be used. Particle Filters 
(PFs) m El la [5] try to approximate the complete, in general multimodal, system state density 
with a set of weighted particles. This comes at the cost of computational complexity due to the 
curse of dimensionality. Another problem is sample degeneracy, in particular for high-dimensional 
state spaces, as a consequence of the particle reweighting using the likelihood function. To reduce 
computational complexity and circumvent the problem of sample degeneracy, the Progressive Gaussian 
Filter (PGF) [B] approximates the system state as a Ganssian and moves the particles automatically to 
the important regions of the state space. Nevertheless, those nonlinear filters are still costly compared 
to linear filters applied to nonlinear problems. 

The Extended Kalman Filter (EKF) explicitly linearizes the underlying models around the current 
state estimate to be able to apply the standard KF to the considered problem [?]• Iterated variants 
of the EKF (lEKF) try to improve the EKF approach by iteratively searching for a more suitable 
point for the model linearization [7] . A more suitable way of model linearization is based on statistical 
linearization, which can be performed in the best case analytically or, in all other cases, by exploiting 
samples in the form of Linear Regression Kalman Filters (LRKFs) [8]. LRKFs obtain the reqnired 
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moments by propagating samples through the system and measurement models and computing 
sample mean and sample covariance matrix, respectively. The most commonly used LRKF is the 
Unscented Kalman Filter (UKF) [9]. Its samples are, however, limited in number and placement, 
and several attempts exist to improve the UKF by hnding its optimal parameter settings for specific 
estimation problems [TO]. Nevertheless, the additional computational time required to find proper UKF 
parameters can be used instead to propagate more carefully chosen samples through the models in order 
to improve the estimation quality. For example, the Gauss-Hermite Kalman Filter (GHKF) introduced 
in m is based on the Gauss-Hermite quadrature rule to generate its samples. Unfortunately, the 
GHKF also suffers from the curse of dimensionality, and hence, is not well suited for larger state 
spaces. The fifth-degree Gubature Kalman Filter (CKF) [TO] relies on a fifth-degree spherical-radial 
integration rule to construct its samples. However, by design, the number of samples still grows 
quadratically in the state dimension making the hfth-degree CKF computational burdensome when 
dealing with larger state spaces. A non-deterministic LRKF was proposed with the Randomized UKF 
(RUKF) |13l I14| . Here, an arbitrary number of randomly scaled and rotated UKF sample sets are 
combined to a single set of samples. On the one hand this has the advantage of being able to change 
the employed number of samples. On the other hand it prohibits a reproducible filter behavior and 
imposes an additional runtime overhead compared to other LRKFs due to the creation of several 
random orthogonal matrices per prediction and measurement update. The estimation quality of any 
LRKF, regardless of the sampling it is based on, can be improved by using the iterated statistical 
linearization approach |15j . A more detailed overview of linear Liters and LRKFs can be found 
in [TO[[II]. 

Recently, the Smart Sampling Kalman Filter (S^KF) was proposed in [TO] [TO], and already 
successfully used for Simultaneous Localization and Mapping (SLAM) in [TO]. The S^KF uses optimal 
deterministic sampling of a standard normal distribution comprising an arbitrary number of samples 
based on a combination of the Localized Cumulative Distribution (LCD) and a modified Cramer-von 
Mises distance [sniiii]. The same LCD approach was also extended to approximate arbitrary Gaussian 
mixture distributions [22]. In this paper, we extend the S^KF with point symmetric Gaussian sampling. 
This new symmetric sampling reflects the point symmetry of the Gaussian distribution and allows 
for matching all odd moments of a standard normal distribution exactly, which results in a more 
accurate state estimation. We also improve the numerical stability of the LCD approach. As a result, 
it is now possible to compute an optimal approximation of a thousand-dimensional standard normal 
distribution comprising tens of thousands of samples. Moreover, due to the exploited point symmetry, 
the required number of samples that have to be optimized is reduced by half. Consequently, the new 
samples can be computed much faster. In this regard, the S^KF catches up to state-of-the-art LRKFs 
as all of them also rely on a point symmetric sampling. 

The remainder of the paper is organized as follows. First, we give a detailed problem formulation 
including the general recursive Bayesian estimation, its transition to Nonlinear Kalman Filters, and 
finally their approximation using LRKFs. After that, in Sec.]^ we introduce a new point symmetric 
version of the S^KF. In Sec. we evaluate the new symmetric S^KF by computing higher-order 
moments of multivariate standard normal distributions, showing the advantage of the new point 
symmetric sampling scheme when dealing with symmetric measurement equations, and performing 
extended object tracking. Finally, conclusions are given in Sec. 

2. Problem Formulation 

We consider estimating the hidden state of a discrete-time stochastic nonlinear dynamic system, 
where the system model 

= afc(^fc-i>ILfc) (1) 


2 


describes its temporal evolutior|^ Additionally, we receive noisy measurements that are assumed 
to be generated according to the measurement model 

y^ = hk{xk,yk) . (2) 

Thus, the received measurements are realizations of the random variable y^. The noise variables 
Wf, and Vj^ are assumed to be Gaussian and independent of the system state for all time steps. 

Our goal is to determine a state estimate of in the form of a conditional state density 

fkixk) ■■= = fixk\yf,,---,y^) 

recursively over time using Bayesian inference. Such a recursive estimator consists of two parts, namely 
the prediction step and the filter step. On the one hand, the prediction step propagates the state 
estimate f^_i{xk-i) from time step A; — 1 to the current time step k by employing the system model 
Q resulting in the predicted state density 

fkixk) ■■= = f{xk\yf._^,---,y^) ■ 


On the other hand, the filter step incorporates a newly received measurement y^ into this propagated 
state estimate fj^{xk) with the aid of the measurement model (i). 

To compute the prediction step, Q has to be transformed into a state transition density according 
to 

fkixk I := j 6{xf, - afc(xfc_i, Wfc)) • dw,, , 

where (5(-) denotes the Dirac delta distribution and the system noise density 

fk{wk)=N{wt,]Wk,C'^) . 


Based on this, the predicted state density can be obtained with the Chapman-Kolomogorov equa¬ 
tion 


fkixk) = J fkixk I Tfc-i) • fLi{xk-i) 

S{xk - wffc)) • fk-iixk-i) ■ fkiwk) dTfc-i dwfc . 


( 3 ) 


Concerning the filter step, Bayes’ rule serves as its fundamental basis and the posterior, i.e., 
filtered, state density is obtained according to 


, ftiy^lxk)- f^ixk) 


(4) 


where denotes the likelihood function and f^{-) the measurement distribution given all measure¬ 
ments up to the time step A; — 1. As /f (•) is independent of x^, it is only a normalization constant 
assuring that /^(xj.) is a valid density function. The required likelihood function is obtained in a 
similar manner to the state transition density according to 

fk ilk I ■= J Klk - hkixk.yk)) ■ fkivk) dvk , (5) 

with the measurement noise density 


fkivk) =Miyk\ykCl) 


^The subscript k denotes the discrete time step and vectors are underlined. 
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Given an initial state estimate /o(xq), the recursive state estimation of is performed with 

the alternate use of the prediction step (§ and the filter step Q. However, computing both steps 
analytically is almost always impossible except for special cases such as linear models corrupted by 
additive Gaussian noise. Gonsequently, approximative solutions of the recursive Bayesian estimation 
have to be used instead. 

2.1. Nonlinear Kalman Filters 

Besides the popular Particle Filters, an important approximative Bayesian estimation technique is 
given by the class of Nonlinear Kalman Filters. These filters make two simplifications of the general 
Bayesian estimator described above. 

First, the filtered as well as the predicted state estimates are always approximated as Gaussians. 
Hence, the predicted state density is given by 

flixk) , 


with predicted state mean 


'•p 

Kk = 


Xk ■ fl{Xk)^X,, 

ak{xk-i,Wk) • fk-i{xk_i) ■ fkim) dxk_i dwk 


( 6 ) 


and predicted state covariance matrix 

Cf = y {xk - xl) • {xk - xiy • fj^ixk) dxk 

= {ak{xk-i,Wk) - xl) ■ {ak{xk_i,Wk) - xl)'^ • /l-ife-i) ' fkim)(^Xk_idWk 


( 7 ) 


respectively. 

Second, the Bayesian hlter step Q can be reformulated in form of the joint density of 

predicted state Xk and measurement according to 


fkixk) = 


fkNXk,yk\yk.i..i) 

fUlklh-iA^ 


In Kalman filtering, this joint density is also approximated as a Gaussian resulting in the approximative 
Gaussian posterior state density 


fkiXk) 

with posterior state mean 


v( 

Xk 


■ Cf 



[h\ 

[ikl 


^ cl\ 

) 




= A7(xfc;x|,C|) 


xl = xl + Cl'y ■ [Cl) ■ {y^ - yj 

and posterior state covariance matrix 


ci = ci-ci’y.{ci)-^.{cr)-^ , 


( 8 ) 

(9) 
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which are the well-known Kalman Filter formulas [7] • In order to obtain Q and Q, the measurement 
mean 

yk = jIk-J'kiik) ^Vk 

= jjhkixk,Vk) ■ flixk) ■ fkiUk) dxfcdvfc , 

the measurement covariance matrix 


( 10 ) 


Cfc = I iVk.- yJ • • /KyJ dy 


= jjihk{xk,Vk) - If,) ■ ihkixk,Vf,) - y^)'^ ■ fl{xf,) ■ flivf,) dxk , 


( 11 ) 


as well as the cross-covariance matrix of predicted state and measurement 


= 11 (xk- xl) ■ {y^ - y^)'^ ■ f^’^ixf,, yj dx^ dy 






( 12 ) 


{xk - xl) ■ ihf,{xf,,yf,) - y^)'^ ■ fl{xf,) ■ f^iy^) dx^ dx^ 


are required. 

In summary, running a Nonlinear Kalman Filter boils down to computing two integrals for the 
prediction step and three integrals for the filter step. Moreover, note that no explicit likelihood 
function ([^ is required at all, which eases filter development. 

2.2. Linear Regression Kalman Filters 

Unfortunately, computing the above integrals in closed-form is only possible for a small set of 
system and measurement models. This includes for example polynomials, trigonometric and, of 
course, linear functions (leading to the classical linear Kalman Filter). The closed-from integration 
yields the best possible Kalman Filter for the given models. In all other cases, numerical integration 
methods have to be applied, which can result in a reduced estimation performance and an increased 
computational demand. 

As we aim for an online estimation technique, the employed integration method has to possess a 
real-time capable computational complexity and still deliver adequate integration results in order to 
obtain a good recursive state estimation quality. When looking at the five integrals in Sec. 2.1 it can 


be seen that the last terms are always a product of two independent Gaussian densities, namely 


fk-liXk-l) ■ fk{Wk)=M 


Xk-l 

m 


^k-l 

Mk 


'k-1 

0 


0 


(13) 


for the prediction and 


fkixk) ■ fkiUk) = AA 


( 

Xk 


Cl 0 

V 

Lk 

Lk 

0 Cl 


(14) 


for the measurement update, respectively. 

By exploiting this fact, an efficient, i.e., fast but still accurate, computation of the integrals is 
possible. This can be done by replacing the occurring Gaussian distributions (13) and (14) with 


proper Dirac mixture densities, that is, sample-based density representations, and evaluating the 
system model ([^ and measurement model ([^ using these samples. As a result, emphasis is directly 
put on the important regions of the state space, and the regions covered by only a small portion of the 
probability mass of the Gaussian densities are neglected. This approach leads to the class of Linear 
Regression Kalman Filters (LRKFs). 
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A Dirac mixture approximation of a given probability density function fk{sk) comprising 
samples with sample positions ^ and sample weights is defined as 


Mk 

S{Sk- Sk^i) , (15) 

i=l 


where for the sample weights 

Mk 

^ ^ Ckfcji — 1 
i=l 

holds. Such an approximation can be computed in several ways, e.g., by simply using random sampling 
or deterministic approaches such as done by the UKF. 

Now, we assume that an approximation 


Mk 

i=l 


^fc-1 



(16) 


of the Gaussian joint density (13) comprising Mk samples with positions m/i J ' weights 


cj_i 

Ok^i is at hand. By replacing the Gaussian joint density in the integrals ([^ and ([^ with the Dirac 
mixture approximation (16), and using the Dirac sifting property, we obtain an approximation for the 
predicted state mean 

Mk 

^ ^ ' Oiki^Hik—ljii iHkji) 

i=l 


- p 


and the predicted state covariance matrix 


Mk 

Cf -'^ak,i ■ {akixk-i^i^Wk^i) - ^k) • -^D"^ • 

i=l 

The same procedure is used for computing the integrals required for the measurement update. First, 
a Dirac mixture approximation 


Mk 

^ ^ Oik,i ■ ^ 
i=l 


Xk 

.Xk. 



(17) 


of the Gaussian (14) encompassing Mk samples with positions [xjj, and weights is computed. 

Second, by replacing the joint Gaussian with its Dirac mixture approximation in the three integrals 
(0, and ( |12[ ), and using once more the Dirac sifting property, we get an approximation for the 
measurement mean 

Mk 

0^k,i • hki^k ii^lk^i) , 

i=l 


y-k 


the measurement covariance matrix 


Mk 

^k-Yl ■ ihixk^i, Vk^i) - y^) ■ {hkixk^i ,, 

and the cross-covariance matrix 


Mk 

■ ixk,i - xl) ■ ih^ixf,^i,Vk^i) - y^)^ 

i=l 
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It should be noted that the number of samples for the time and the measurement update do not 
have to be the same. Moreover, the Dirac mixture approximations (16) and (0 can be completely 
different in the way they are obtained, although this is usually not the case. 


3. The Smart Sampling Kalman Filter with Symmetric Samples 


In |20j, the authors proposed an approach based on the Localized Cumulative Distribution (LCD) 
to optimally approximate Gaussian distributions with a set of equally weighted samples. This is 
done by transforming the approximation problem into an optimization problem. Unfortunately, 
such optimization is very time-consuming, and hence, not suitable for online nonlinear filtering. To 
enable the LCD approach for online filtering, it is used to optimally sample only a standard normal 
distribution offline (before filter usage) and transform these samples online (during filter usage) to 
any required Gaussian with the aid of the Mahalanobis transformation [23]. This is the fundamental 
basis for the S^KF [T7|. But other nonlinear estimators such as the Progressive Gaussian Filter also 
make use of this sampling technique. 

However, the current LCD approach can, and will, arrange the samples in an arbitrary way to 
optimally approximate a standard normal distribution. More precisely, it does not take the point 
symmetry of the standard normal distribution explicitly into account. This can lead to a set of 
asymmetric samples. Here, we extend the LCD approach to approximate an A^-dimensional standard 
normal distribution with a set of point symmetric and equally weighted samples. Moreover, we improve 
the numerical stability of the LCD approach to allow approximations of very high dimensions. This 
new optimal point symmetric sampling is then used to obtain a symmetric version of the S^KF. 

The use of point symmetric samples offers several benefits. First, the symmetric samples reflect the 
symmetry of the standard normal distribution allowing for more accurate estimation results as will be 
seen in the evaluation. Second, the used point symmetry makes it possible to capture all odd moments 
of the standard normal distribution exactly (a proof is given in 0. Finally, the required number of 
sample positions that have to be optimized is reduced by half. Consequently, an approximation can 
be computed much faster. Besides point symmetry, other symmetries such as axial symmetry could 
also be exploited. However, this would prevent us from using an arbitrary number of samples and 
would limit the optimizer’s control over the sample placement. 

In the following, we first define the set of parameters describing point symmetric Dirac mixtures 
in Sec. 3.1 These parameters have then to be optimized in order to approximate a standard normal 


distribution in an optimal way. This requires distance measures between a standard normal distribution 
and the point symmetric Dirac mixtures given in Sec. 3.2 Subsequently, the gradients of the distance 


measures are derived in Sec. 3.3 Finally, in Sec. 3.4, we give a procedure to compute point symmetric 


Dirac mixture approximations of standard normal distributions based on the introduced distance 
measures and their gradients. 

3.1. Point Symmetric Dirac Mixtures 


First, we have to modify the generic Dirac mixture (15) to obtain a symmetric one. This is 


performed by distinguishing between an even and odd number of samples. For the case of 2L samples 
with L G 1N_|_, that is, the even case, we place the samples point symmetrically around the state space 
origin yielding the equally weighted Dirac mixture 


1 


'^S{s-Si) + 5{s + Si) , 


i=l 


with sample positions Sj and — Sj. For 2L -|- 1 samples, the odd case, we additionally place a sample 
fixed at the state space origin and obtain the Dirac mixture 


1 


2L + 1 


^{s) + ^ S{s — Sj) -I- 6{s + Sj) 


i=l 
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This preserves the desired point symmetry. As the position of the additional sample in the odd case is 
constant, the set of parameters 

S := 

is the same for both Dirac mixtures. 

Note that the UKF sample set (with equally weighted samples) is a special case of these point 
symmetric Dirac mixtures [9]. With an even number of samples, it has the parametrization 

s, = ViV-e, ViG{l,...,iV} , 

where denotes the unit vector along the i-th dimension. In the odd case, the parametrization is 

sj = VN + 0.5 • Vi G {1,..., A^} , 

that is, the sample spread is larger due to the additional point mass at the state space origin. 


3.2. Distance Measures 

Our goal is to determine the set of parameters S for the above Dirac mixtures so that they 
approximate a multivariate standard normal distribution in an optimal way. This requires a distance 
measure between the involved continuous and discrete distributions. As the classical cumulative 
distribution function is not suitable for the multi-dimensional case [25], we utilize the LCD approach 
in the same way as the asymmetric S^KF. 

Definition 3.1 (Localized Cumulative Distribution |17j). 

Let f{s) be a N-dimensional density function. The corresponding Localized Cumulative Distribution 
is defined as 

F{m,b) = [ fis)-K{s — m,b)ds , 
with m G b G 1R+, and the symmetric and integrable kernel 

K{s — m,b) = exp 

Here, m characterizes the loeation of the kernel and b its size. 



The LCD of an A^-dimensional standard normal distribution is an integral of a product of two 
(unnormalized) Gaussians. By using the fact that the product of two Gaussian distributions is also an 
unnormalized Gaussian and the integral over a probability density equals one, its LCD is obtained 

by [ 20 ] 

Fjg{m,b) = f AA(s ; 0 , Iat) • (27r)^6'^AA(s ; m, 6 ^l 7 v) ds 
Jrn 


(27r)f6^ 


1 


(27r)Ty/|(l +62)1^1 


62 


1 + 62 


exp 


m\2 


} m \\2 
' 2(1 + 62 ) 


exp 


2(1 + 62 ) 


where Itv denotes the identity matrix of dimension N. Based on the Dirac sifting property, the LCD 
of the Dirac mixture comprising an even number of samples is given by 


F!{S,rn,b) 


1 

TL 
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whereas the LCD of the odd Dirac mixture is 


FnS,in,b) = 


1 


2L + 1 


exp 



1 Ui-mh 

2 62 


/ 1 II — s,- — m|L 


To compare the standard normal LCD with a Dirac mixture LCD, we use the modified Cramer-von 
Mises distance defined as follows. 


Definition 3.2 (Modified Cramer-von Mises Distance). 

The modified Cramer-von Mises (CvM) distance D between two LCDs F(rn,b) and F{rn,b) is given 
by 


D{F,F) = J w{b) J (^F{m,b) — F{m,b)^ dm db 


with weighting function 


w{b) = 




vr 2 
0 


, b G [0, bmax\ 
, elsewhere . 


The new term in the weighting function w{b) (in contrast with the definition in [T7j) is a 
consequence of the involved LCDs Fj^f, Ff, and F^. Without this term, the modified CvM distances 
between these LCDs would be unbounded for an increasing dimension N, which in turn would make 
the distances numerically unstable. This improvement now allows the S^KF to compute Dirac mixture 
approximations for very high state dimensions, e.g., N > 200. 

First, we consider the distance between the standard normal distribution and the Dirac mix¬ 
ture comprising an even number of samples, and then extend the results to the odd case. The 
distance D{Fjg-, Ff) can be split into three terms according to 


D{F^, F!) = D%S) = Dl- 2Dl{S) + Dl{S) 


with the sample-independent part 


Jo \l + b^ 


N 

62 


d6 , 


and the sample-dependent terms 


26 ( 262 


^ L 

• J^exp 

2 = 1 


Is-f 

I*2ll2 


2 (1 + 262 ) 


d6 , 


and 


Dl{S) = 


26 


L L 




l|la-«ill2 


262 


+ exp -- 


lk. + %ll+dj. 


262 


The proof is given in[^ Note that the integration over 6 is bounded by 6max due to the support of the 
weighting function w{b). To speed up the distance computation, the following theorem can be applied. 

Theorem 3.1. 

For a given bmax, the following expression for D^{S) can he obtained 


L L 


«!(«) = pip EE % 

’ i=i j=i 


1 


exp 




2 262 


2 262 


8 -^+.2 




where Ei(-) denotes the exponential integral. 
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Proof. The proof is given in 0 


□ 


Now, we consider the case of an odd number of samples. Like in the even case, F^) can be 

split into three terms 

D{Fm, FD = D°{S) = Dl- 2D°^{S) + Dl{S) . 


The first part D° is also independent of the samples S and identical to its even counterpart, i.e., 

Dl = Dl . 

The sample-dependent terms D^^S) and D^{S) can be expressed in terms of the even case plus 
additional terms due to the fixed sample at the state space origin according to 


or p 

+1 


2P 


2L + 1 V 1 -h 262 


d6 


and 


('or'\2 1,2 

D^{S) = 


(2L + 1)2^ 


2{2L + iy 


+ 


f 


46 


E 


exp 


1 kJl2 

2 262 


d6 


(2L + 1P^ 

The proof is given in Like for the even case, also the computation of the odd case can be sped up 
by using the following theorem. 


Theorem 3.2. 

For a given bmax, 


the following expression for D^{S) can be obtained 


»3°(S) = + 


(2L + 1)2' 


2(2L-h 1)2 




E 


6i 


exp 


’j|l2 


(2L + 1)2^ 2 

where Ei(-) denotes the exponential integral. 
Proof. The proof is given in 0 


2 262 


+ xlUJ|2 Ei I —- 


Ell2 


2 262 


□ 


The extra terms in D^iS) and D^(S), compared to the even case, reflect the influence of the 
additional sample, placed at the state space origin, on the distance between the Dirac mixture and the 
standard normal distribution. The result is that the point mass of the additional sample will cause 
the other samples to have a slightly larger spread compared to a sample set without the additional 
sample at the state space origin. Concerning the above mentioned numerical stability, we also give a 
proof for the boundedness of both distances D^{S) and D°{S) in[F} 

3.3. Gradients of the Distance Measures 

In order to optimize the parameters S of a given Dirac mixture, we chose to apply a gradient-based 
iterative optimization procedure. This requires the partial derivatives of the two distance measures 
D^{S) and D°{S) with respect to the set of parameters S. Eor the even case, the partial derivatives 
are 

VdE{l,...,iV} , 



asf 

- aF' 

' aF’ 

with its two terms 




<9D|(5) 


rbmax 26 

f 262 

dsf 

2L J 

L (1 + 262) 

Vl + 262 


exp 


I£ill2 


2{1 + 262 ) 


d6 , 
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and 


dDliS) 


ds 


id) 


{2Lf 


- sf'^)exp 
i=i 


+ sf'^) exp 


^ ll-^i -j\\2 

2 2P 


1 lUi +^ill2 


+ 


d6 . 


2 262 

Analogous to D^{S), the following theorem can be used for the computation of its partial derivatives. 

Theorem 3.3. 

For a given bmax, the following expression for can he obtained 


ds 


dDliS) 


E( 


(2L)- 


(d) _ (<i)^ 


Ei —- 
2 


1 ll«i - s 


■3 Il2 


262 


+ (»f+ 


2 262 


where Ei(-) denotes the exponential integral. 

Proof. The proof is given in 0 

As with the distance D°{S) itself, its partial derivatives 

dD^(S) dD°(S) dD°(S) ^, 

-^ = -2- +VdG TV} 

ds\^^ dsf^ asf) 

can be obtained in terms of the even case plus additional terms according to 

dD^{S) _ 2L dDliS) 


[d) 


and 


ds, 


dD°,{S) _ (2L)2 dDliS) 


2L + 1 Qg(d) 

(d) /-femax 


2s 


1 


ds 


(d) 


(2L+1)2 g^(d) (2L + iyJo 6 


exp 


1 lkJl2 

2 262 


d6 


To ease the computation of the partial derivatives of D^{S), the next theorem can be used. 

Theorem 3.4. 

For a given bmax, the following expression for can be obtained 


ds. 


dDiiS) _ (2L)2 dDliS) 


fd) 


ds 


(d) 


+ 


(2L + 1)2 Qgid) (2L + 1) 


Ei — 


1 lls. 


i\\2 


2 262 


where Ei(-) denotes the exponential integral. 
Proof. The proof is given in 


□ 


□ 


3 . 4 . The S^KF with Symmetric Samples 

After defining the distance measures D‘^{S) and D°{S), including their partial derivatives, we can 
compute a Dirac mixture approximation of a standard normal distribution comprising an arbitrary 
number of optimally placed point symmetric samples. 

To achieve this, we utilize the low memory BEGS quasi-Newton optimization (L-BFGS) [26]. The 
low memory variant is essential here, as it avoids the explicit computation and storage of the Hessian 
matrix. The set of Dirac mixture parameters S encompasses L x N single parameters to be optimized. 
Hence, the Hessian matrix of D®(5) or D°{S) would contain (L x N)'^ entries. When now assuming 
only a linear increase in the number of samples for an increasing dimension N, that is, 2L = C ■ N, 
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Dimension 


Figure 1: Size of the Hessian matrix for different dimensions and number of samples. 


with a linear factor C, the size of the Hessian grows with 0{N'^). This problem is illustrated in 
Fig. a for two different linear factors (5 and 10). It can be seen that approximating a 100-dimensional 
standard normal distribution with a thousand samples would require a Hessian of « 20 gigabytes, and 
already a Hessian of over 4 gigabytes in case of only 500 samples. Consequently, using the Hessian 
directly in the optimization is intractable. 

The computation of the point symmetric samples works as follows. 

1. Choose the desired number of samples M to approximate the A^-dimensional standard normal 
distribution. 

2. Depending on the number of samples, the even or odd distance measure is selected. 

3. A proper maximum kernel width 6max has to be selected. Generally speaking, the larger the 
dimension N is the larger 6max has to be in order to consider all sample positions during the 
optimization, and thus, to get a meaningful approximation. Empirically, we have found that a 
value of 70 is large enough for up to < 1000 dimensions. 

4. An initial set of Dirac mixture parameters is obtained by drawing L samples randomly from an 
A-dimensional standard normal distribution. 

5. The L-BFGS procedure optimizes the Dirac mixture parameters such that the distance measure 
is minimized, i.e., it moves the initial samples in the state space to approximate the standard 
normal distribution in an optimal way. The Dirac mixture parameters resulting from the L-BFGS 
procedure are denoted as 

6. The Dirac mixture approximation given by the set of parameters finally undergoes 

a Mahalanobis transformation so that the transformed Dirac mixture captures the identity 
covariance matrix of the standard normal distribution as much as possible. This is necessary, as 
the proposed distance measures do not explicitly consider the covariance matrix as a constraint. 
The transformation is done by computing the sample covariance matrix 



2 = 1 


and scaling each sample according to 

s, = ^/{C^-z, ViG{l,...,L} , 

where (C^)“^ denotes the Cholesky decomposition of the inverse sample covariance matrix. 
The Dirac mixture defined by the parameters is the final approximation of the standard 

normal distribution. 

Experimentally, we have found that in situations where the covariance matrix was added as an explicit 
constraint to the optimization procedure, the sample covariance matrix of the resulting Dirac mixture 
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(a) Symmetric approach with 12 
samples. 


(b) Symmetric approach with 13 
samples. 


(c) Asymmetric approach with 12 
samples. 


Figure 2: Different LCD-based approximations of a two-dimensional standard normal distribution 
with samples Sj (blue), point symmetric counterparts — Sj (orange), fixed sample at the state space 
origin in the odd case (green), and 95% confidence interval of the standard normal distribution (gray). 


was less accurate compared to the proposed Mahalanobis approach. Moreover, the constraint made 
the optimization procedure much more time-consuming. Consequently, we dropped this approach in 
favor of the Mahalanobis transformation. 

The results of different LCD-based approximations of a two-dimensional standard normal distri¬ 
bution are depicted in Fig. On the one hand. Figures and show approximations using the 
new symmetric sampling scheme comprising 12 and 13 samples, respectively. The point symmetric 
arrangement around the state space origin can be clearly seen. Note also the subtle difference in the 
sample spread of the samples near the state space origin between Fig. 2a and Fig. |2b[ This is caused 
by the additional point mass from the fixed sample at the state space origin. On the other hand. 
Fig. 2c shows an approximation based on the classical asymmetric sampling scheme also comprising 
12 samples. Here, the optimization procedure can position all samples individually, and hence, the 
samples are not necessarily arranged in a point symmetric way like in the depicted case. 

Using the above described optimal point symmetric sampling of a standard normal distribution, 
we obtain a symmetric version of the S^KF. Furthermore, to avoid a re-computation on every program 
start, we store any computed Dirac mixture approximation of a standard normal distribution persistent 
in the file system for later reuse. This mechanism is called the Sample Cache and was already used by 
the asymmetric S^KF. 


4. Evaluation 

In this Section, we want to compare the new point symmetric sampling scheme of the S^KF 
with its asymmetric version and other state-of-the-art LRKFs. First, we take a closer look at the 
approximation of higher-order moments of standard normal distributions. Then, the advantage of 
using a point symmetric sampling scheme, and hence, the new version of the S^KF, is discussed by 
means of a simple symmetric measurement equation. Finally, extended object tracking is performed 
to compare the recursive state estimation quality of various state-of-the-art LRKFs. 

4-1. Moment Errors of a Standard Normal Distribution 

First, we investigate how well the employed sampling schemes of state-of-the-art LRKFs approx¬ 
imate the moments of a standard normal distribution. Thus, we are interested in the expectation 
values 

K[xf^X 2 ^ ■ ■ ■ x^] = [ xf^xlf^ ■ ■ ■ x^M{x;0,lj\f) dx , 
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with 


N 

rii = m , 0 < Tii < m 

i=l 

for different dimensions N and moment orders m. This has the advantage of being independent of a 
concrete system and measurement model. Note that for given N and m, there exists N™ possible 
combinations to select the values for n,. Hence, a moment is characterized by values. 

As all state-of-the-art LRKFs employ a point symmetric sampling scheme and capture mean and 
covariance matrix, we focus on higher-order even moments. More precisely, we take a look at the 
4th, 6th, and 8th moment, i.e., m G {4,6,8}. In many practical applications, 3D and 6D Gaussian 
distributions are of special interest. For example, the location and orientation in 2D or the position in 
3D can be estimated using a three-dimensional system state. When additionally considering velocities 
in the 2D case or the orientation in the 3D case, a six-dimensional state is required. Thus, we chose to 
study the approximations of standard normal distributions with these two dimensions, i.e., N G {3,6}. 

To compare the different LRKF sampling techniques, for each dimension N and moment m we 
compute a normalized moment error according to 


\ 


1 


AT” 


^(jgtrue _ ]gLRKF^2 ^ 
i=i 


where Ej denotes one of the AI™ possible combinations for the mth moment, the superscript ’’true” 
the true moment value and ”LRKF” the LRKF sampling estimate. Note that the 8th moment of the 
6D standard normal distribution is characterized by already over 1.5 million combinations. 

We compare the new symmetric S^KF, the UKF with equally weighted samples, the RUKF, the 
fifth-degree CKF, and the GHKF with two quadrature points. As the symmetric S^KF and the RUKF 
do not have unique sample sets, we perform 100 Monte Garlo runs for both filters. In each run, for 
both filters new sample sets are generated, and the average moment error over all runs are computed. 
Moreover, the S^KF and the RUKF are evaluated with different number of samples. The results are 
depicted in Figures and As the UKF, the fifth-degree CKF, and the GHKF have a hxed number 
of samples, they are depicted as a bar at their respective employed number of samples. 

The moment errors of the UKF and the S^KF are identical for the case when both filters use 
the same number of samples. This is due to the fact that both sample sets are equally weighted 
and the S^KF places its samples like the UKF (except for the rotation) as this minimize the utilized 
distance measure. The RUKF, however, scales the utilized UKF sample sets randomly. Consequently, 
its sample set is not necessarily equally weighted like for the UKF and the S^KF, and hence, their 
moment errors differ. Considering all moments, the S^KF delivers always smaller errors than the 
RUKF and for nearly all moments smaller errors than the GHKF (for the same number of samples). 
The sampling of the fifth-degree CKF is the only one that matches the 4th moment exactly. This is 
based on the fact that the spherical-radial rule of the fifth-degree CKF has a 5th-degree accuracy |12j . 


4-2. Symmetric Measurement Equations 

To illustrate the advantages of using a symmetric sampling scheme, we consider the two-dimensional 
system state 

X = [a, b]^ 

combined with the scalar and symmetric measurement equation 

y = h{x, v) = \/a? -|- V , 

where v is zero-mean Gaussian noise with variance = 0.01. Hence, we measure a noisy distance 
from the system state x to the state space origin. Such a symmetric measurement equation arises for 
example in [271 [28]. 
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(b) Error 6th moment. 



Figure 3: Moment errors of a 3D standard normal distribution. 
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(a) Error 4th moment. (b) Error 6th moment. 
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(c) Error 8th moment. 


Figure 4: Moment errors of a 6D standard normal distribution. 


We assume that the true system state is 

Nitrile [^) ^] ) 

and our goal is to estimate it using a Nonlinear Kalman Filter initialized with mean and covariance 
matrix 

xP = [0, 0]^, CP = diag(4,0.5) . 

The setup is illustrated in Fig. From the the estimator’s perspective, the received measurement y 
could stem from any state located on the gray circle around the prior mean, not only Hence, a 

Nonlinear Kalman Filter cannot gain any new information about the hidden system state from the 
measurement y. This situation is reflected in a zero cross-covariance matrix of state and measurement 
C^’P in Q and Q. Consequently, the posterior state estimate (mean and covariance matrix) equals 
the prior, no matter what prior uncertainty we have. 

Now, we try to reproduce this result when using LRKFs. More precisely, we compare the 
asymmetric S^KF, its new symmetric version (both using 11 samples), and the UKF. We perform 
R = 100 Monte Carlo runs. In each run, we reset the initial state estimate, and simulate a noisy 
measurement y to perform one measurement update. Moreover, both S^KF variants compute a new 
set of samples approximating a standard normal distribution in every Monte Carlo run. We compute 
the Root Mean Square Error (RMSE) for the posterior mean 



R 


E 
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Figure 5: Symmetric measurement model with prior mean (orange), prior uncertainty (blue), true 
system state (green). 


where denotes the estimated posterior mean of run r. Additionally, we compute the RMSE of the 
posterior covariance matrix 


\ 


1 ^ 
r=l 


c® - cp\ 


where C® denotes estimated posterior covariance matrix of run r and || • || the Frobenius norm. 

The results of the evaluation are depicted in Fig. It can be seen that the asymmetric S^KF 
is the only filter with a small error. This can be explained with its asymmetric sampling scheme, 
as it makes a system state on the circle around the prior mean more likely than others. Hence, it 
introduces (theoretically non-existent) correlations between the measurement and the system state, 
i.e., a non-zero cross-covariance matrix. As a consequence, the asymmetric S^KF updates its state 
estimate mistakenly (albeit not that much). The other estimators do not have such a problem due to 
their symmetric sampling. So even such a simple scenario demonstrates the advantages of the new 
point symmetric sampling scheme of the S^KF. 


4-3. Extended Object Tracking 

Now, we consider estimating the pose and extent of a cylinder in 3D based on a Random 
Hypersurface Model (RHM) [29l[3n]. The system state is composed of position Cf. = [c|, c^, and 
velocity^;. = rotation angles (/>^ = and their velocities W;. = [a;|, as 

well as the cylinder radius and length 4 according to 

The temporal evolution of the cylinder is modeled with a constant velocity model 


Xk = w , 


with system matrix 



Is 

Is 

0 

0 

0 


0 

0 

h 

0 

0 


0 

0 

h 

h 

0 


0 

0 

0 

0 

h 


and zero-mean Gaussian white noise w with covariance matrix 


= diag(10-®l3,10-%, 10-i°l2,10-^12, 10 -^ 12 ) . 

This linear model allows to compute the prediction step analytically for all LRKFs. 
A measurement is a noisy point 

Ik = irk, vi ^kV 
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Figure 6 : Estimation errors for symmetric measurement model. 


from the cylinder’s surface. It is related to the system state by means of the implicit nonlinear 
measurement equation 


Q = h{xk,y^,v,s) 


+ (m^)2 - rf 
ml - s-Ik 

_ {ml-s- hf _ 


(18) 


where 


m • ^i^k)) Hk-i^- ^k) 


and zero-mean Gaussian white noise u with covariance matrix C’’ = 0.01 • I 3 and multiplicative white 
noise s ~ ^(—0.5,0.5) Furthermore, R(-) denotes a 3D rotation matrix around the respective axis. 
It is important to note that the measurement equation itself depends on the received measurement 
and the estimator only takes the so-called pseudo measurement 0 as input. The reason for this is 
that the proposed measurement model tries to minimize the Euclidean distance between the received 
measurements and the cylinder’s surface, and thus, generates measurements of value zero in the 


optimal case. Note also that the quadratic term in the last row of (18) is necessary when dealing with 
multiplicative noise in combination with Kalman Filters |31l I29j . 

At each time step, we receive a set of 20 measurements 


As the order of processing measurements affects the filtered state estimate, we do not process 
measurements sequentially. More precisely, we process all measurements at once, that is, in a single 
measurement update, by stacking the measurements into a large measurement vector according to 


‘o' 


hixk,T^k^,y^^\s^^^) 

0 




This, in turn, requires a set of 20 • 4 = 80 measurement noise variables in total. Together with the 
twelve-dimensional system state, a LRKF has to sample a 92-dimensional random vector to perform a 
measurement update. The number of samples used by the investigated LRKFs are summarized in 
Table It should be noted that the GHKF HH is intractable for the considered scenario as it relies 
on a Gartesian product and would require at least 2®^ samples. 

We simulate a nonlinear trajectory of a cylinder over 500 time steps including rotations in all its 
three degrees of freedom as depicted in Fig. Additionally, the initial cylinder’s length of 1 increases 
to 1.5 after 200 time steps, and the initial radius of 0.3 increases to 0.4 after further 100 time steps. 


^As LRKFs can only sample Gaussian distributions, the uniform distribution will be approximated as a Gaussian 
using moment matching. 
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LRKF 

Number of samples 



Fifth-degree CKF 

2 • 92^ -M 

16, 

929 

RUKF (with 5 iterations) 

5 • (2 • 92) -b 1 


921 

RUKF (with 20 iterations) 

20 • (2 • 92) -b 1 = 

1, 

841 

Asymmetric S^KF 

Freely selectable 


461 

Asymmetric S^KF 

Freely selectable 

1, 

841 

Symmetric S^KF 

Freely selectable 


461 

Symmetric S^KF 

Freely selectable 

1, 

841 


Table 1: LRKF settings for the measurement update. 


Finally, at time step 400, the cylinder’s length shrinks back to 0.5. We perform 100 Monte Carlo runs. 
In reach run, we initialize the estimators with 


= [cT,0,...,0,1,2]T 

C3 = diag(C'=, lO-^Ig, 10-^1 


4, 10 ^ 12 ) 


where c denotes the mean and C'’’ the covariance of the first set of measurements To- For each 
investigated LRKF, we compute the cylinder position RMSE (Fig.J^), the RMSE of the angle between 
the true cylinder longitudinal axis and the estimated one (Fig. 8b), as well as the cylinder volume 
RMSE (Fig. 8c). Regarding the cylinder position, the RUKF instances were the Liters with the 


largest errors although they used the same or twice the number of samples of the S^KF variants. The 
asymmetric S^KF was a little bit less accurate than the symmetric S^KF and the fifth-degree CKF. 
Same results can be observed for the cylinder orientation error. For the cylinder volume error, all 
estimators had noticeable error peaks at time steps 200, 300, and 400. These can be explained with 
the abrupt shape changes of the cylinder at the respective time steps. Furthermore, the fifth-degree 
CKF is not as good as in the other estimation quality criteria, and also the asymmetric S^KF is 
slightly better than the symmetric S^KF in the beginning. 

However, when looking at the runtimes of the respective LRKF measurement updates in Fig 


8d 


the hfth-degree CKF was the slowest filter due to its large amount of samples. The runtimes of the 
asymmetric and the symmetric S^KF were nearly identical as they used the same number of samples. 
For the case when the RUKF and the S^KF variants used the same number of samples, the RUKF 
was slower (11.5 ms compared to 4.5 ms) due to the additional overhead resulting from the creation 
of several 92-dimensional random orthogonal matrices during each measurement update. All in all, 
both S^KF variants were the filters yielding the best compromise between runtime performance and 
estimation accuracy. Moreover, this illustrates the advantage of being able to select the number of 
samples independently of the state/noise dimensions, in contrast to the hfth-degree CKF. 


5. Conclusions 

In this paper, we introduced a new point symmetric Gaussian sampling scheme for the Smart 
Sampling Kalman Filter. This rehects the point symmetry of the Gaussian distribution and allows for 
matching all odd moments of a standard normal distribution exactly. The new sampling technique 
does not only improve the estimation quality of the S^KF, it also speeds up the computation of the 
Gaussian samples as the number of Dirac mixture parameters to be optimized is reduced by half. 

After describing the structure of a sample-based Kalman Filter, we extended the general Dirac 
mixture to a point symmetric form by distinguishing between an even and an odd number of samples. 
Then, we adapted the existing LGD distance measure to these new Dirac mixtures and also gave 
formulas for their respective gradients. These are required by the iterative optimization procedure 
which optimizes the Dirac mixture parameters to optimally approximate a multi-dimensional standard 
normal distribution with a set of equally weighted point symmetric samples. Furthermore, we improved 


18 











Volume RMSE Position RMSE 


3 


TH 



Figure 7: Cylinder state (blue) after 360 time steps inclusive its trajectory (green line) and 20 noisy 
measurements (orange crosses). 



(a) Cylinder position error. 



(b) Cylinder orientation error. 



Figure 8: Cylinder tracking simulation results. 
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the numerical stability of the optimization, and together with the halved number of Dirac mixture 
parameters to be optimized, now it is possible to compute optimal approximations of a thousand¬ 
dimensional standard normal distribution comprising 10,000 samples. As the Progressive Gaussian 
Filter (PGF) [6] also relies on the S^KF Gaussian sampling, it can directly use and beneht from the 
new point symmetric sampling scheme. 

The evaluations showed that the S^KF can handle symmetric measurement equations now much 
better when using the new symmetric sampling scheme. It was also shown that the S^KF gave the 
best compromise between estimation accuracy and filter runtime when dealing with high-dimensional 
problems such as extended object tracking. Additionally, this illustrated the advantage of the S^KF 
being able to use an arbitrary number of samples independent of the state/noise dimensions. 


Appendix A. Odd Moments of a Point Symmetric Dirac Mixture 

The odd moments of an arbitrary density function f{x) with x G IR'^ are dehned as 


N 

E/in 

1=1 


x-l = 


N 

/ n 


Uj = 2k + 1, 0 <nj < 2k+ l,k£fi . 

1=1 

For a standard normal distribution, i.e., f{x) = AA(x;0, Iw), all odd moments equals zero. Hence, we 
have to show that this also holds for a point symmetric Dirac mixture density function comprising 
2L samples. By replacing the density f{x) with a point symmetric Dirac mixture approximation we 
obtain 


N 




X/\ = 


1=1 


N L 


1 


2L 


L f N N 

1=1 



N 


E n<i-n<^' =0 


*=i \i=i 


1=1 


The same result can be easily obtained for the case of an odd number of samples 2L -|- 1 where the 
additional sample is placed at the state space origin. 


Appendix B. Proof of Distance D^{S) 

By using the facts that the distance D^{S) is composed of sums of products of unormalized 
Gaussians and their product is also an unnormalized Gaussian as well as the integral over a Gaussian 
equals always one, the three terms of the distance D^{S) are obtained according to 
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Appendix C. Proof of Theorem 3.1| 

Like in [20], to compute the term D^{S) we use that for z > 0 


/•bmax 2 ^ , 1 z , 


(C.l) 


where Ei(x) is the exponential integral defined as 


rx gt 

Ei(x) = / — dt 

1 — 00 ^ 


Moreover, the product rule gives 
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and together with (C.l) we obtain 
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This directly resnlts in the expression 
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Appendix D. Proof of Distance D°{S) 

The distance D°{S) differs from its even counterpart due to the additional sample placed fixed at 
the state space origin. This does not effect D°, and hence, it equals Df. The other two terms are 
sums of their reweighted even counterparts (due to the changed sample weight) and terms comprising 
also products of unnormalized Gaussians. Hence, they are given as 
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Appendix E. Proof of Theorem |3.2| 


A closed-form expression for D^{S) can directly be obtained by using again (C.2) as well as the 
closed-form expression for D^{S) resulting in 


DiiS) = 


{2Lf 


(2L + 1)2 


Dl{S) + 


h'L. 


2{2L + lY 


+ 


_ t y 

(2L-b 1)2 ^ 

i=l 


bL 


exp 


I ^2 112 


2 2b^ 


+ o Il^ill2 “n 


1 lls. 


Uill2 


2 262 


Appendix F. Boundedness of D^{S) and D°{S) 

We show the boundedness of the distances D^{S) and D°{S) for an increasing dimension N. For 
a given 6 niax it holds 
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Hence, the distance D^{S) is bounded by 6 max according to 


hm D^S) = hm Dl - 2Z1|(5) + DUS) < ^ 
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In a similar manner, the same result can be obtained for the distance D°{S). 


Appendix G. Proof of Theorem 3.3 


A closed-form expression for can directly be obtained using (C.l) resulting in 


ds. 


dDliS) 


ds 


id) 


(2i)' 


E(< 

i=i 




I ^ (d) , 

+ (s) + S 


+ Ei 


23 
































Appendix H. Proof of Theorem 3.4 


A closed-form expression for can directly be obtained using (C.l) and the closed-form 


expression 


resulting in 
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